Instructions

Below you will find several empty R code chunks and a few places where a line starts with the word “Answer:”. Your task is to fill in the required code and answer the questions as stated.

Eggs Dataset

Today you will be working with a dataset of birds:

Here is a full data dictionary describing all of the variables:

Notice that the last two variables are integer codes. They are stored as numbers but correspond to a category.

Starting plot

Create a scatter plot showing the mass of a male bird (x-axis) and the mass of an egg (y-axis):

You should notice that the plot’s scale makes it hard to see the relationship between the two variables.
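A minimal sketch of this first plot with ggplot2. The data frame name `birds` is a placeholder of mine, and the six rows are copied from the parrot and owl tables printed later in this handout so that the example runs on its own; in your own notebook, use the full dataset instead.

```r
library(ggplot2)

# Stand-in data: six rows copied from the tables printed in this handout.
# `birds` is a placeholder name; swap in the full dataset.
birds <- data.frame(
  name      = c("Swift", "Rock", "Regent", "Tawny", "Barking", "Powerful"),
  egg_mass  = c(5.95, 4.85, 9.40, 39.0, 39.3, 56.5),
  male_mass = c(64.7, 53.0, 175, 397, 694, 1447)
)

# Male mass on the x-axis, egg mass on the y-axis, one point per bird.
p <- ggplot(birds, aes(x = male_mass, y = egg_mass)) +
  geom_point()
p
```

Even with six birds, most points bunch up in the lower-left corner because a few heavy species stretch both axes.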

Changing the scale

Now add the layers scale_x_log10() and scale_y_log10() to the plot.
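One way this could look, again on a small stand-in data frame (the name `birds` and the choice of six rows are assumptions; the values are copied from the tables printed in this handout):

```r
library(ggplot2)

# Stand-in data copied from the printed tables; `birds` is a placeholder name.
birds <- data.frame(
  name      = c("Swift", "Rock", "Regent", "Tawny", "Barking", "Powerful"),
  egg_mass  = c(5.95, 4.85, 9.40, 39.0, 39.3, 56.5),
  male_mass = c(64.7, 53.0, 175, 397, 694, 1447)
)

# Same scatter plot, but both axes are drawn on a base-10 log scale.
p_log <- ggplot(birds, aes(x = male_mass, y = egg_mass)) +
  geom_point() +
  scale_x_log10() +
  scale_y_log10()
p_log

# Optional numeric cross-check: correlation of the log-transformed masses.
log_cor <- cor(log10(birds$male_mass), log10(birds$egg_mass))
```

The axis labels still show the original units; only the spacing between tick marks changes.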

How would you now describe the relationship between the two variables (I just need one sentence here)?

Answer: On the log-log scale, egg mass and male mass show a strong positive, roughly linear relationship.

Parrots

Create a new dataset called parrots consisting of just those birds that are parrots (hint: use the type variable; double hint: look at the raw data for exactly how to format the filter query):

## # A tibble: 12 x 10
##    genus    species  name   type  egg_mass male_mass mating_system display
##    <chr>    <chr>    <chr>  <chr>    <dbl>     <dbl>         <int>   <int>
##  1 Aprosmi… erythro… Red-w… Parr…    11.5      135               2       3
##  2 Lathamus discolor Swift  Parr…     5.95      64.7             2       3
##  3 Neophema chrysos… Blue-… Parr…     4.20      45.7             2       1
##  4 Neophema petroph… Rock   Parr…     4.85      53.0             2       1
##  5 Neophema pulchel… Turqu… Parr…     3.90      42.7             2       1
##  6 Neopsep… bourkii  Bourk… Parr…     3.75      46.0             2       1
##  7 Pezopor… wallicus Ground Parr…     6.85      78.0             2       1
##  8 Polytel… alexand… Alexa… Parr…     7.75      96.0             3       3
##  9 Polytel… anthope… Regent Parr…     9.40     175               2       3
## 10 Polytel… swainso… Superb Parr…     8.10     153               2       2
## 11 Psephot… haemato… Red-r… Parr…     4.50      61.4             2       1
## 12 Purpure… spurius  Red-c… Parr…     7.15     117               2       1
## # ... with 2 more variables: resource <int>, clutch_size <dbl>
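The filter that produces a table like the one above can be sketched with dplyr. Two things here are assumptions: the data frame name `birds`, and the exact label `"Parrot"`, since the printout truncates the type column. Check the raw data for the real spelling.

```r
library(dplyr)

# Stand-in data copied from the printed tables; `birds` is a placeholder name.
# The label "Parrot" is assumed, because the printout only shows a truncated value.
birds <- data.frame(
  name      = c("Swift", "Rock", "Regent", "Tawny", "Barking", "Powerful"),
  type      = c("Parrot", "Parrot", "Parrot", "Owl", "Owl", "Owl"),
  egg_mass  = c(5.95, 4.85, 9.40, 39.0, 39.3, 56.5),
  male_mass = c(64.7, 53.0, 175, 397, 694, 1447)
)

# Keep only the rows whose type matches the parrot label exactly.
parrots <- filter(birds, type == "Parrot")
parrots
```

The comparison is case-sensitive and must match the stored string exactly, which is why the hint points you to the raw data.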

Now add a layer to the previous plot (keeping the log scales) where the parrots are highlighted in the color “red”. To make them stand out, make the base layer have an alpha value of 0.15. Finally, add a text annotation describing to the reader that the red points are parrots.
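A sketch of the highlighting pattern: draw every bird faintly, then draw the parrots again on top in red, and place a text note with annotate(). The name `birds`, the label `"Parrot"`, and the annotation coordinates are assumptions chosen for this small stand-in data; with the full dataset you will want to move the text to an empty region of the plot.

```r
library(ggplot2)
library(dplyr)

# Stand-in data copied from the printed tables; names and the "Parrot" label are assumed.
birds <- data.frame(
  name      = c("Swift", "Rock", "Regent", "Tawny", "Barking", "Powerful"),
  type      = c("Parrot", "Parrot", "Parrot", "Owl", "Owl", "Owl"),
  egg_mass  = c(5.95, 4.85, 9.40, 39.0, 39.3, 56.5),
  male_mass = c(64.7, 53.0, 175, 397, 694, 1447)
)
parrots <- filter(birds, type == "Parrot")

p_parrots <- ggplot(birds, aes(x = male_mass, y = egg_mass)) +
  geom_point(alpha = 0.15) +                      # faint base layer: all birds
  geom_point(data = parrots, color = "red") +     # highlight layer: parrots only
  annotate("text", x = 100, y = 40, color = "red",
           label = "Red points are parrots") +    # x and y are in data units
  scale_x_log10() +
  scale_y_log10()
p_parrots
```

Because the second geom_point() receives its own `data` argument, it inherits the x and y mappings from ggplot() but plots only the filtered rows.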

Smoothing line

Now, we are going to add a best-fit line to the plot. We do this by adding geom_smooth(method = "lm") to the plot. Add this to the plot using the log-log scale, but without highlighting the parrots.

I think the best-fit is a bit too colorful and noisy. Fix it by changing the line to this instead: geom_smooth(method = "lm", color = "black", se = FALSE, linetype = "dashed", size = 0.5).
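A sketch of the restyled best-fit line on stand-in data (the name `birds` is a placeholder; the values come from the tables printed in this handout). One deliberate change: ggplot2 3.4.0 and later use `linewidth` rather than `size` for line thickness, so the sketch uses `linewidth = 0.5`. On an older ggplot2, keep `size = 0.5` as written above.

```r
library(ggplot2)

# Stand-in data copied from the printed tables; `birds` is a placeholder name.
birds <- data.frame(
  name      = c("Swift", "Rock", "Regent", "Tawny", "Barking", "Powerful"),
  egg_mass  = c(5.95, 4.85, 9.40, 39.0, 39.3, 56.5),
  male_mass = c(64.7, 53.0, 175, 397, 694, 1447)
)

p_fit <- ggplot(birds, aes(x = male_mass, y = egg_mass)) +
  geom_point() +
  # Thin dashed black line, no confidence ribbon.
  geom_smooth(method = "lm", color = "black", se = FALSE,
              linetype = "dashed", linewidth = 0.5) +
  scale_x_log10() +
  scale_y_log10()
p_fit

# The scales are applied before the fit, so the line is a linear model
# of log10(egg_mass) on log10(male_mass). The same model, fitted directly:
fit <- lm(log10(egg_mass) ~ log10(male_mass), data = birds)
slope <- unname(coef(fit)[2])
```

A positive `slope` here means that heavier birds tend to lay heavier eggs, which is the pattern the dashed line shows.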

Does the best-fit match the visual pattern you saw between the size of a bird and the size of its eggs (again, one sentence is sufficient)?

Answer: Yes, it matches the visual pattern between the size of the bird and its eggs (a positive correlation).

Outliers

If you look at the plot, you’ll see one bird in particular that has a very large egg size given the mass of the bird itself. This is the Red-tailed tropicbird (also, you can add pictures to R Markdown!):

The tropicbird has a male mass of 218.7g and an egg mass of 87.00g. Annotate this point on the graph and give a label for it:
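A sketch of annotating a single point, using the tropicbird masses given above. The data frame name `birds`, the red marker, and the label offset are my own choices; the other rows are copied from the tables printed in this handout.

```r
library(ggplot2)

# Stand-in data: six rows from the printed tables plus the tropicbird values above.
birds <- data.frame(
  name      = c("Swift", "Rock", "Regent", "Tawny", "Barking", "Powerful",
                "Red-tailed tropicbird"),
  egg_mass  = c(5.95, 4.85, 9.40, 39.0, 39.3, 56.5, 87.00),
  male_mass = c(64.7, 53.0, 175, 397, 694, 1447, 218.7)
)

p_outlier <- ggplot(birds, aes(x = male_mass, y = egg_mass)) +
  geom_point() +
  # Redraw the tropicbird in red so it stands out from the other points.
  annotate("point", x = 218.7, y = 87.00, color = "red", size = 3) +
  # Label it; a negative hjust pushes the text to the right of the point.
  annotate("text", x = 218.7, y = 87.00, hjust = -0.1,
           label = "Red-tailed tropicbird") +
  scale_x_log10() +
  scale_y_log10()
p_outlier
```

The coordinates passed to annotate() are in the original units (grams); the log scales transform them along with the rest of the data.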

Your turn

Construct one final graph of the data. You are free to use the other variables that we did not look at yet or to look at different classes of birds. For this graph (only), please add an appropriate title and annotations.

## # A tibble: 8 x 10
##   genus   species  name     type  egg_mass male_mass mating_system display
##   <chr>   <chr>    <chr>    <chr>    <dbl>     <dbl>         <int>   <int>
## 1 Aegoli… funereus Boreal   Owl       12.4       101             2       3
## 2 Asio    flammeus Short-e… Owl       21.3       278             2       5
## 3 Asio    otus     Long-ea… Owl       23.0       233             2       4
## 4 Ninox   connive… Barking  Owl       39.3       694             2       1
## 5 Ninox   strenua  Powerful Owl       56.5      1447             2       1
## 6 Strix   aluco    Tawny    Owl       39.0       397             2       3
## 7 Strix   nebulosa Great G… Owl       52.0       884             2       3
## 8 Surnia  ulula    Norther… Owl       21.8       270             2       1
## # ... with 2 more variables: resource <int>, clutch_size <dbl>
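As one possible direction for the final graph, here is a sketch that labels a few owls by name and adds a title and axis labels. It uses only the four owl rows whose names are printed in full above; the data frame name `owls`, the title wording, and the label placement are assumptions, not the expected answer.

```r
library(ggplot2)

# Four owl rows copied from the table above (names printed in full there).
owls <- data.frame(
  name      = c("Boreal", "Tawny", "Barking", "Powerful"),
  egg_mass  = c(12.4, 39.0, 39.3, 56.5),
  male_mass = c(101, 397, 694, 1447)
)

p_owls <- ggplot(owls, aes(x = male_mass, y = egg_mass)) +
  geom_point() +
  # Name each owl just above its point.
  geom_text(aes(label = name), vjust = -0.8) +
  scale_x_log10() +
  # Extra padding on the y-axis so the top label is not cut off.
  scale_y_log10(expand = expansion(mult = 0.15)) +
  labs(
    title = "Egg mass versus male body mass for four owls",
    x = "Male mass (g, log scale)",
    y = "Egg mass (g, log scale)"
  )
p_owls
```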